feat(maintainers): serve-time role resolution from a live maintainers table (+ label/review view perf fix) #192
Merged
…s table

PR #190 reconciled author_association by rewriting the stored column across 4 tables hourly: write-amplification, an hour staleness window, and it corrupts the event-time meaning of the snapshot. Separately, the per-row label/review association resolution went through contributor_repo_roles, a plain view re-derived per row (~30ms each: sort + DISTINCT ON over ~8k rows), which times the validator out on high-volume miners. Both collapse into one primitive: a live, indexed maintainers table.

- Add maintainers (repo_full_name, github_id, login, association, refreshed_at), PK (repo_full_name, github_id); repo_full_name stored lowercased so reads join as m.repo_full_name = LOWER(src.repo_full_name) and still hit the PK.
- Convert the reconcile service into MaintainerPopulateService: same fetcher (direct collaborators + org members), same safety rules (fetch-before-write, fail-closed-per-repo, skip-on-empty), but atomically upserts the maintainers table per repo instead of rewriting 4 tables. Runs hourly + once on boot.
- Repoint pr_labels_by_actor / issue_labels_by_actor actor_association, pr_linked_issues issue_author_association, and pr_review_summary's maintainer filter off contributor_repo_roles onto maintainers (indexed lookup; ~10-20x).
- Resolve PR/issue author_association at serve time in miners.service via COALESCE(maintainers.association, stored). No new payload field, so the validator needs no change.
- Repoint /repos/:repo/maintainers to the table. Mark stored association columns as ingest snapshots; leave contributor_repo_roles in place (no remaining hot-path consumers).
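The safety rules named above (fetch-before-write, fail-closed-per-repo, skip-on-empty) can be sketched in TypeScript as follows. This is a minimal in-memory model, not the service's code: the types, the function names, the synchronous fetcher, and the Map-based table are all assumptions made for illustration, and fail-closed-per-repo is read here as "a failed fetch leaves that repo's existing rows untouched and does not stop other repos".

```typescript
// All names below are hypothetical; only the three safety rules come from the PR text.
interface Maintainer {
  githubId: number;
  login: string;
  association: string; // e.g. "MEMBER" or "COLLABORATOR"
}

// In-memory stand-in for the maintainers table: lowercased repo -> github_id -> row.
type MaintainersTable = Map<string, Map<number, Maintainer>>;

// Stand-in for the real fetcher; synchronous here only to keep the sketch short.
type Fetcher = (repoFullName: string) => Maintainer[];

type Outcome = "updated" | "skipped-empty" | "skipped-error";

function populateRepo(
  table: MaintainersTable,
  repoFullName: string,
  fetchMaintainers: Fetcher,
): Outcome {
  let fetched: Maintainer[];
  try {
    // fetch-before-write: finish the whole fetch before touching stored rows.
    fetched = fetchMaintainers(repoFullName);
  } catch {
    // fail-closed-per-repo (as interpreted here): a failed fetch leaves this repo's rows as they were.
    return "skipped-error";
  }
  // skip-on-empty: an empty result is not trusted as "this repo has no maintainers".
  if (fetched.length === 0) return "skipped-empty";

  const key = repoFullName.toLowerCase();
  // Apply the upserts to a copy, then swap it in with a single assignment.
  // The swap stands in for the per-repo transaction around the upsert.
  const next = new Map<number, Maintainer>(table.get(key) ?? []);
  for (const m of fetched) next.set(m.githubId, m);
  table.set(key, next);
  return "updated";
}

function populateAll(
  table: MaintainersTable,
  repos: string[],
  fetchMaintainers: Fetcher,
): Record<string, Outcome> {
  const outcomes: Record<string, Outcome> = {};
  // Each repo is handled independently, so one failure does not affect the others.
  for (const repo of repos) outcomes[repo] = populateRepo(table, repo, fetchMaintainers);
  return outcomes;
}
```

The sketch only models inserts and updates, matching the word "upsert"; how rows for departed maintainers are removed is not stated in the text and is left out.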
entrius
approved these changes
Jun 17, 2026
Why
author_association / reviewer_association are snapshotted at ingest and never refreshed, so role-based scoring (issue-bonus tier, PR-drop gate, maintainer review gate) reads stale roles. PR #190 patched this by rewriting the stored columns across 4 tables hourly, but that's the wrong layer: write-amplification, an hour staleness window, registered+installed-only coverage, and it corrupts the event-time meaning of the stored snapshot.

Separately, the per-row label/review association resolution went through contributor_repo_roles, a plain view re-derived per row (EXPLAIN ANALYZE: ~30ms each, a sort + DISTINCT ON over ~8k rows across pull_requests/issues/reviews/comments, 1–2× per PR). That's the ~70ms/row that times the validator out on high-volume miners.

Both collapse into one primitive: a live, indexed maintainers table.

What
- Add maintainers (repo_full_name, github_id, login, association, refreshed_at), PK (repo_full_name, github_id). repo_full_name is stored lowercased so every read joins m.repo_full_name = LOWER(<src>.repo_full_name) and still uses the PK index.
- MaintainerPopulateService: reuses the same fetcher (fetchRepoCollaborators + fetchOrgMembers) and safety rules (fetch-before-write, fail-closed-per-repo, skip-on-empty), but atomically upserts maintainers per repo (in a transaction) instead of mutating 4 tables. Hourly @Cron + once on OnModuleInit.
- Repoint the views off contributor_repo_roles onto maintainers: pr_labels_by_actor / issue_labels_by_actor (actor_association), pr_linked_issues (issue_author_association), and pr_review_summary (the maintainer CHANGES_REQUESTED filter). Indexed lookup instead of a per-row re-derivation → ~10–20× on the hot path.
- Resolve roles at serve time in miners.service.ts via COALESCE(maintainers.association, stored) on PR/issue authors. This resolves onto the existing author_association field: no new payload field, no gittensor change. Safe because the validator only ever tests … in MAINTAINER_ASSOCIATIONS, so maintainers-only (present → role, absent → NULL/stored) is lossless.
- Repoint /repos/:repo/maintainers to the table. Stored association columns are marked as ingest snapshots; contributor_repo_roles is left in place (no remaining hot-path consumers).

Validation
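A stale snapshot is easy to model in memory. The sketch below mirrors the serve-time rule from the What list, COALESCE(maintainers.association, stored) with the join on the lowercased repo name. It is an illustration under assumptions, not the query in miners.service.ts: the interfaces, the function names, the "repo#id" key format, and the acme/widgets sample data are all hypothetical.

```typescript
// Hypothetical shapes; only the COALESCE rule and the lowercased-repo join come from the PR text.
interface MaintainerRow {
  repoFullName: string; // stored lowercased
  githubId: number;
  association: string;
}

interface StoredAuthor {
  repoFullName: string; // as ingested, any casing
  authorGithubId: number;
  authorAssociation: string | null; // ingest-time snapshot, never mutated here
}

// Index rows by "repo#github_id", mirroring the (repo_full_name, github_id) primary key.
function buildIndex(rows: MaintainerRow[]): Map<string, string> {
  const index = new Map<string, string>();
  for (const r of rows) index.set(`${r.repoFullName}#${r.githubId}`, r.association);
  return index;
}

// In-memory equivalent of COALESCE(m.association, stored),
// with m looked up on LOWER(src.repo_full_name) and the author's github_id.
function resolveAssociation(index: Map<string, string>, src: StoredAuthor): string | null {
  const live = index.get(`${src.repoFullName.toLowerCase()}#${src.authorGithubId}`);
  return live ?? src.authorAssociation;
}

// Made-up sample data: one maintainer row and two stored authors.
const sampleIndex = buildIndex([
  { repoFullName: "acme/widgets", githubId: 42, association: "MEMBER" },
]);

// Stale snapshot: stored as CONTRIBUTOR, but present in maintainers.
const staleAuthor: StoredAuthor = {
  repoFullName: "Acme/Widgets",
  authorGithubId: 42,
  authorAssociation: "CONTRIBUTOR",
};
// Not in maintainers: the stored snapshot is served unchanged.
const outsideAuthor: StoredAuthor = {
  repoFullName: "acme/widgets",
  authorGithubId: 7,
  authorAssociation: "CONTRIBUTOR",
};

console.log(resolveAssociation(sampleIndex, staleAuthor)); // prints "MEMBER"
console.log(resolveAssociation(sampleIndex, outsideAuthor)); // prints "CONTRIBUTOR"
```

Because the lookup only overrides the served value, the stored snapshot keeps its ingest-time meaning, and a mixed-case repo name still resolves through the lowercased key.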
Loaded the full packages/db schema (incl. 11_maintainers + repointed views 21/22/24/25) into a throwaway Postgres and seeded a stale-snapshot case. All pass:

- CONTRIBUTOR, in maintainers → served
- pr_labels_by_actor actor: maintainer / non-maintainer
- pr_review_summary maintainer CR count (stale stored reviewer_association)
- /maintainers mixed-case repo input

npm run build, lint, format:check clean in packages/das. No new deps (no lockfile change). No test files (team rule).

SQL migration checklist (prod): run in this order
1. Create the table (additive, safe anytime): body of packages/db/11_maintainers.sql.
2. Deploy the app build. Safe with an empty table: serve-time COALESCE(m.association, stored) falls back to stored; views not yet swapped. OnModuleInit triggers an immediate populate.
3. Verify the table populated (seconds after boot; else await/trigger one hourly run).
4. Cut over the views (instant CREATE OR REPLACE VIEW; paste the new bodies from this PR):
   - 21_view_pr_review_summary.sql
   - 22_view_pr_linked_issues.sql
   - 24_view_pr_labels_by_actor.sql
   - 25_view_issue_labels_by_actor.sql
5. Verify serving + perf:
   - GET /repos/phase-rs/phase/maintainers returns the live set.
   - 1388610/59729252 shows MEMBER/COLLABORATOR; SELECT author_association … still shows the unmutated snapshot.
6. Rollback (if needed): CREATE OR REPLACE VIEW the four views back to their contributor_repo_roles/stored bodies (originals are on origin/test). Table + app can stay (harmless).

Follow-ups